ANALYSING POLYNUCLEOTIDE SEQUENCES
INTRODUCTION 

Three methods dominate molecular analysis of
nucleic acid sequences: gel electrophoresis of
restriction fragments, molecular hybridisation, and the
rapid DNA sequencing methods. These three methods have
a very wide range of applications in biology, both in
basic studies, and in the applied areas of the subject
such as medicine and agriculture. Some idea of the
scale on which the methods are now used is given by the
rate of accumulation of DNA sequences, which is now
well over one million base pairs a year. However,
powerful as they are, they have their limitations. The
restriction fragment and hybridisation methods give a
coarse analysis of an extensive region, but are rapid;
sequence analysis gives the ultimate resolution, but it
is slow, analysing only a short stretch at a time.
There is a need for methods which are faster than the
present methods, and in particular for methods which
cover a large amount of sequence in each analysis.

This invention provides a new approach which
produces both a fingerprint and a partial or complete
sequence in a single analysis, and may be used directly
with complex DNAs and populations of RNA without the
need for cloning.

In one aspect the invention provides apparatus for
analysing a polynucleotide sequence, comprising a
support and attached to a surface thereof an array of
the whole or a chosen part of a complete set of
oligonucleotides of chosen lengths, the oligonucleotides
being capable of taking part in hybridisation
reactions. For studying differences between
polynucleotide sequences, the invention provides in another
aspect apparatus comprising a support and attached to a
surface thereof an array of the whole or a chosen part
of a complete set of oligonucleotides of chosen lengths
comprising the polynucleotide sequences, the
oligonucleotides being capable of taking part in
hybridisation reactions.

In another aspect, the invention provides a method
of analysing a polynucleotide sequence, by the use of a
support to the surface of which is attached an array of
the whole or a chosen part of a complete set of
oligonucleotides of chosen lengths, which method comprises
labelling the polynucleotide sequence or fragments
thereof to form labelled material, applying the
labelled material under hybridisation conditions to the
array, and observing the location of the label on the
surface associated with particular members of the set
of oligonucleotides.

The idea of the invention is thus to provide a
structured array of the whole or a chosen part of a
complete set of oligonucleotides of one or several
chosen lengths. The array, which may be laid out on a
supporting film or glass plate, forms the target for a
hybridisation reaction. The chosen conditions of
hybridisation and the length of the oligonucleotides
must at all events be sufficient for the available
equipment to be able to discriminate between exactly
matched and mismatched oligonucleotides. In the
hybridisation reaction, the array is explored by a
labelled probe, which may comprise oligomers of the
chosen length or longer polynucleotide sequences or
fragments, and whose nature depends on the particular
application. For example, the probe may comprise
labelled sequences amplified from genomic DNA by the
polymerase chain reaction, or a mRNA population, or a
complete set of oligonucleotides from a complex
sequence such as an entire genome. The end result is a
set of filled cells corresponding to the
oligonucleotides present in the analysed sequence, and
a set of "empty" sites corresponding to the sequences
which are absent in the analysed sequence. The pattern
produces a fingerprint representing all of the sequence
analysed. In addition, it is possible to assemble most
or all of the sequence analysed if an oligonucleotide
length is chosen such that most or all oligonucleotide
sequences occur only once.

The number, the length and the sequences of the
oligonucleotides present in the array "lookup table"
also depend on the application. The array may include
all possible oligonucleotides of the chosen length, as
would be required if there was no sequence information
on the sequence to be analysed. In this case, the
preferred length of oligonucleotide used depends on the
length of the sequence to be analysed, and is such that
there is likely to be only one copy of any particular
oligomer in the sequence to be analysed. Such arrays
are large. If there is any information available on
the sequence to be analysed, the array may be a
selected subset. For the analysis of a sequence which
is known, the size of the array is of the same order as
the length of the sequence, and for many applications,
such as the analysis of a gene for mutations, it can be
quite small. These factors are discussed in detail in
what follows.
2. OLIGONUCLEOTIDES AS SEQUENCE PROBES

Oligonucleotides form base paired duplexes with
oligonucleotides which have the complementary base
sequence. The stability of the duplex is dependent on
the length of the oligonucleotides and on base
composition. Effects of base composition on duplex
stability can be greatly reduced by the presence of
high concentrations of quaternary or tertiary amines.
However, there is a strong effect of mismatches in the
oligonucleotide duplex on the thermal stability of the
hybrid, and it is this which makes the technique of
hybridisation with oligonucleotides such a powerful
method for the analysis of mutations, and for the
selection of specific sequences for amplification by
DNA polymerase chain reaction. The position of the
mismatch affects the degree of destabilisation.
Mismatches in the centre of the duplex may cause a
lowering of the Tm by 10°C compared with 1°C for a
terminal mismatch. There is then a range of
discriminating power depending on the position of
mismatch, which has implications for the method
described here. There are ways of improving the
discriminating power, for example by carrying out
hybridisation close to the Tm of the duplex to reduce
the rate of formation of mismatched duplexes, and by
increasing the length of oligonucleotide beyond what is
required for unique representation. A way of doing
this systematically is discussed.
3. ANALYSIS OF A PREDETERMINED SEQUENCE

One of the most powerful uses of oligonucleotide
probes has been in the detection of single base changes
in human genes. The first example was the detection of
the single base change in the betaglobin gene which
leads to sickle cell disease. There is a need to
extend this approach to genes in which there may be a
number of different mutations leading to the same
phenotype, for example the DMD gene and the HPRT gene,
and to find an efficient way of scanning the human
genome for mutations in regions which have been shown
by linkage analysis to contain a disease locus, for
example Huntington's disease and Cystic Fibrosis. Any
known sequence can be represented completely as a set
of overlapping oligonucleotides. The size of the set
is N - s + 1 ≈ N, where N is the length of the sequence
and s is the length of an oligomer. A gene of 1 kb, for
example, may be divided into an overlapping set of
around one thousand oligonucleotides of any chosen
length. An array constructed with each of these
oligonucleotides in a separate cell can be used as a
multiple hybridisation probe to examine the homologous
sequence in any context, a single-copy gene in the human
genome or a messenger RNA among a mixed RNA population,
for example. The length s may be chosen such that
there is only a small probability that any oligomer in
the sequence is represented elsewhere in the sequence
to be analysed. This can be estimated from the
expression given in the section discussing statistics
below. For a less complete analysis it would be
possible to reduce the size of the array, e.g. by a
factor of up to 5, by representing the sequence in a
partly or non-overlapping set. The advantage of using
a completely overlapping set is that it provides a more
precise location of any sequence difference, as the
mismatch will scan in s consecutive oligonucleotides.
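
By way of illustration only, the overlapping set and the way a
single base change scans through s consecutive cells can be
sketched in Python. The sequence, the position of the change and
the function names are hypothetical and chosen for the example;
the comparison assumes a simple substitution with no insertions
or deletions.

```python
def overlapping_oligos(sequence, s):
    """Return the N - s + 1 overlapping oligomers of length s, in order."""
    return [sequence[i:i + s] for i in range(len(sequence) - s + 1)]


def affected_cells(reference, variant, s):
    """Indices of reference oligomers that differ from the variant's."""
    ref = overlapping_oligos(reference, s)
    var = overlapping_oligos(variant, s)
    return [i for i, (a, b) in enumerate(zip(ref, var)) if a != b]


reference = "ACGTTGCATGCCGATAGCTA"               # hypothetical 20-base sequence
variant = reference[:10] + "A" + reference[11:]  # single change at index 10

print(len(overlapping_oligos(reference, 8)))     # N - s + 1 = 13 cells
print(affected_cells(reference, variant, 8))     # the 8 cells covering index 10
```

With s = 8 the changed base falls in eight consecutive
oligomers, so its position is bracketed by the first and last
affected cell.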
4. ANALYSIS OF AN UNDETERMINED SEQUENCE
The genomes of all free living organisms are
larger than a million base pairs and none has yet been
sequenced completely. Restriction site mapping reveals
only a small part of the sequence, and can detect only
a small portion of mutations when used to compare two
genomes. More efficient methods for analysing complex
sequences are needed to bring the full power of
molecular genetics to bear on the many biological
problems for which there is no direct access to the
gene or genes involved. In many cases, the full
sequence of the nucleic acids need not be determined;
the important sequences are those which differ between
two nucleic acids. To give three examples: the DNA
sequences which are different between a wild type
organism and one which carries a mutant can lead the
way to isolation of the relevant gene; similarly, the
sequence differences between a cancer cell and its
normal counterpart can reveal the cause of
transformation; and the RNA sequences which differ
between two cell types point to the functions which
distinguish them. These problems can be opened to
molecular analysis by a method which identifies sequence
differences. Using the approach outlined here, such
differences can be revealed by hybridising the two
nucleic acids, for example the genomic DNA of the two
genotypes, or the mRNA populations of two cell types, to
an array of oligonucleotides which represent all
possible sequences. Positions in the array which are
occupied by one sequence but not by the other show
differences in two sequences. This gives the sequence
information needed to synthesise probes which can then
be used to isolate clones of the sequence involved.
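
The comparison of two hybridisation patterns can be sketched as
a set difference. The following Python fragment is an
illustrative sketch only: the two short sequences and the
function names are hypothetical, and hybridisation is idealised
as exact presence or absence of each oligomer.

```python
def oligo_set(sequence, s):
    """Set of all s-mers present in a sequence (the filled cells)."""
    return {sequence[i:i + s] for i in range(len(sequence) - s + 1)}


def differing_cells(seq_a, seq_b, s):
    """Cells occupied by one sequence but not by the other."""
    a, b = oligo_set(seq_a, s), oligo_set(seq_b, s)
    return a - b, b - a


wild_type = "GATTACAGGCTTACCGA"   # hypothetical sequences
mutant = "GATTACAGTCTTACCGA"      # differs by a single base

only_wild_type, only_mutant = differing_cells(wild_type, mutant, 5)
print(sorted(only_wild_type))
print(sorted(only_mutant))
```

The oligomers found in only one of the two patterns are those
that span the difference, and so supply the sequence from which
a probe could be designed.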
4.1 ASSEMBLING THE SEQUENCE INFORMATION

Sequences can be reconstructed by examining the
result of hybridisation to an array. Any
oligonucleotide of length s from within a long sequence
overlaps with two others over a length s-1. Starting from each
positive oligonucleotide, the array may be examined for
the four oligonucleotides to the left and the four to
the right that can overlap with a one base displacement.
If only one of these four oligonucleotides is found to
be positive to the right, then the overlap and the
additional base to the right determine s bases in the
unknown sequence. The process is repeated in both
directions, seeking unique matches with other positive
oligonucleotides in the array. Each unique match adds
a base to the reconstructed sequence.
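
The stepwise extension described above can be sketched as
follows. This is a minimal Python illustration under idealised
assumptions (every oligomer present gives a clear positive, and
s is at least 2); the target sequence, the seed and the function
name are hypothetical.

```python
BASES = "ACGT"


def reconstruct(seed, positive):
    """Extend a positive s-mer in both directions while the match is unique."""
    s = len(seed)
    sequence, used = seed, {seed}

    # Extend to the right: overlap of s-1 bases plus one new base.
    while True:
        overlap = sequence[-(s - 1):]
        hits = [overlap + b for b in BASES if overlap + b in positive]
        if len(hits) != 1 or hits[0] in used:
            break                      # no match, ambiguous match, or a loop
        used.add(hits[0])
        sequence += hits[0][-1]

    # Extend to the left in the same way.
    while True:
        overlap = sequence[:s - 1]
        hits = [b + overlap for b in BASES if b + overlap in positive]
        if len(hits) != 1 or hits[0] in used:
            break
        used.add(hits[0])
        sequence = hits[0][0] + sequence

    return sequence


target = "ATGGCGTACCTGAAT"                          # hypothetical sequence
positive = {target[i:i + 4] for i in range(len(target) - 3)}
print(reconstruct("CGTA", positive))
```

Extension stops wherever zero or several of the four candidate
oligomers are positive, which is where a repeated oligomer
interrupts the reconstruction.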
4.2 SOME STATISTICS

Any sequence of length N can be broken down to a
set of ~N overlapping sequences s base pairs in
length. (For double stranded nucleic acids, the
sequence complexity of a sequence of N base pairs is
2N, because the two strands are different sequence
combinations.) How big should s be to ensure that most
oligonucleotides will be represented only once in the
sequence to be analysed, of complexity N? For a random
sequence the expected number of s-mers which will be
present in more than one copy is

\mu_{>1} \approx 4^{s} \left( 1 - e^{-\lambda} (1 + \lambda) \right)

where

\lambda = (N - s + 1) / 4^{s}
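
As a worked illustration, the expectation can be evaluated
numerically. The Python sketch below assumes the Poisson form
4^s (1 - e^(-lambda) (1 + lambda)) with
lambda = (N - s + 1)/4^s for a random sequence; the function
name is hypothetical.

```python
import math


def expected_repeated_oligos(n, s):
    """Expected number of distinct s-mers present more than once
    in a random sequence of length n (Poisson approximation)."""
    lam = (n - s + 1) / 4 ** s
    return 4 ** s * (1 - math.exp(-lam) * (1 + lam))


# A sequence of about the size of a cosmid or phage genome.
for s in range(8, 13):
    print(s, round(expected_repeated_oligos(48502, s), 1))
```

The expected number of repeated oligomers falls steeply as s
increases, while the number of cells in a complete array rises
fourfold with each added base.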



For practical reasons it is also useful to know
how many sequences are related to any given s-mer by a
single base change. Each position can be substituted
by one of three bases; there are therefore 3s sequences
related to an individual s-mer by a single base change,
and the probability that any s-mer in a sequence of N
bases is related to any other s-mer in that sequence
allowing one substitution is 3s x N/4^s. The relative
signals of matched and mismatched sequences will then
depend on how good the hybridisation conditions are in
distinguishing a perfect match from one which differs by
a single base. (If 4^s is an order of magnitude greater
than N, there should only be a few, 3s/10, related to
any oligonucleotide by one base change.) The
indications are that the yield of hybrid from the
mismatched sequence is a fraction of that formed by the
perfect duplex.

For what follows, it is assumed that conditions
can be found which allow oligonucleotides which have
complements in the probe to be distinguished from those
which do not.
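
The count of 3s single-base relatives and the estimate
3s x N/4^s can be checked with a short Python sketch. The
function names and the example oligomer are hypothetical.

```python
def single_base_neighbours(oligo):
    """All sequences differing from oligo by exactly one base."""
    neighbours = []
    for i, base in enumerate(oligo):
        for b in "ACGT":
            if b != base:
                neighbours.append(oligo[:i] + b + oligo[i + 1:])
    return neighbours


def related_estimate(n, s):
    """The estimate 3s x N / 4^s for a sequence of n bases."""
    return 3 * s * n / 4 ** s


print(len(single_base_neighbours("ACGTACGTAC")))   # 3s = 30 for a 10-mer
print(related_estimate(48502, 10))
```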

To form an idea of the scale of the arrays needed
to analyse sequences of different complexity it is
convenient to think of the array as a square matrix.
All sequences of a given length can be represented just
once in a matrix constructed by drawing four rows
representing the four bases, followed by four similar
columns. This produces a 4 x 4 matrix in which each of
the 16 squares represents one of the 16 doublets. Four
similar matrices, but one quarter the size, are then
drawn within each of the original squares. This
produces a 16 x 16 matrix containing all 256
tetranucleotide sequences. Repeating this process produces
a matrix of any chosen depth, s, with a number of cells
equal to 4^s. As discussed above, the choice of s is of
great importance, as it determines the complexity of
the sequence representation. As discussed below, s
also determines the size of the matrix constructed,
which must be very big for complex genomes. Finally,
the length of the oligonucleotides determines the
hybridisation conditions and their discriminating power
as hybridisation probes.
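
The nested square layout can be expressed as a coordinate
calculation. In the following Python sketch, bases in odd
positions select the row and bases in even positions select the
column at each level of nesting. The ordering A, C, G, T and the
function names are assumptions made for the example, not
features taken from the text.

```python
BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}   # assumed ordering


def cell_position(oligo):
    """(row, column) of an even-length oligomer in the nested square matrix."""
    if len(oligo) % 2:
        raise ValueError("this sketch handles even lengths only")
    row = col = 0
    for i in range(0, len(oligo), 2):
        row = row * 4 + BASE_INDEX[oligo[i]]       # 1st, 3rd, ... base
        col = col * 4 + BASE_INDEX[oligo[i + 1]]   # 2nd, 4th, ... base
    return row, col


def matrix_side(s):
    """Number of lines along one side of the matrix for even s."""
    return 4 ** (s // 2)


print(cell_position("ACGT"))   # cell of one tetranucleotide in the 16 x 16 matrix
print(matrix_side(10))         # side of the matrix of all 10-mers
```

Because the position is computed directly from the sequence, the
sequence of any positive cell can be read back from its
coordinates.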
The table shows the expected scale of the arrays
needed to perform the first analysis of a few genomes.
The examples were chosen because they are genomes which
have either been sequenced by conventional procedures -
the cosmid scale -, are in the process of being
sequenced - the E. coli scale - or for which there has
been considerable discussion of the magnitude of the
problem - the human scale. The table shows that the
expected scale of the matrix approach is only a small
fraction of the conventional approach. This is readily
seen in the area of X-ray film that would be consumed.
It is also evident that the time taken for the analysis
would be only a small fraction of that needed for gel
methods. The "Genomes" column shows the length of
random sequence which would fill about 5% of cells in
the matrix. This has been determined to be the optimum
condition for the first step in the sequencing strategy
discussed below. At this size, a high proportion of
the positive signals would represent single occurrences
of each oligomer, the conditions needed to compare two
genomes for sequence differences.
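
The 5% figure can be related to sequence length with a short
calculation. The Python sketch below assumes a random sequence
and a Poisson distribution of occurrences per cell; the function
names are hypothetical and the numbers are illustrative, not
values from the table.

```python
import math


def filled_fraction(n, s):
    """Fraction of the 4^s cells occupied by a random sequence of length n."""
    return 1 - math.exp(-(n - s + 1) / 4 ** s)


def length_for_fill(s, fraction=0.05):
    """Length of random sequence that fills the given fraction of cells."""
    return round(-math.log(1 - fraction) * 4 ** s) + s - 1


def single_occurrence_share(n, s):
    """Share of positive cells that hold exactly one occurrence."""
    lam = (n - s + 1) / 4 ** s
    return lam * math.exp(-lam) / (1 - math.exp(-lam))


n = length_for_fill(10)
print(n, filled_fraction(n, 10), single_occurrence_share(n, 10))
```

At about 5% occupancy the great majority of positive cells hold
a single occurrence under these assumptions.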
5. REFINEMENT OF AN INCOMPLETE SEQUENCE 

Reconstruction of a complex sequence produces a
result in which the reconstructed sequence is
interrupted at any point where an oligomer that is
repeated in the sequence occurs. Some repeats are
present as components of long repeating structures
which form part of the structural organisation of the
DNA, dispersed and tandem repeats in human DNA for
example. But when the length of oligonucleotide used
in the matrix is smaller than that needed to give
totally unique sequence representation, repeats occur
by chance. Such repeats are likely to be isolated.
That is, the sequences surrounding the repeated
oligomers are unrelated to each other. The gaps caused
by these repeats can be removed by extending the
sequence to longer oligomers. In principle, those
sequences shown to be repeated by the first analysis,
using an array representation of all possible
oligomers, could be resynthesised with an extension at
each end. For each repeated oligomer, there would be
4 x 4 = 16 oligomers in the new matrix. The
hybridisation analysis would now be repeated until the
sequence was complete. In practice, because the
results of a positive signal in the hybridisation may
be ambiguous, it may be better to adopt a refinement of
the first result by extending all sequences which did
not give a clear negative result in the first analysis.
An advantage of this approach is that extending the
sequence brings mismatches which are close to the ends
in the shorter oligomer closer to the centre in the
extended oligomer, increasing the discriminatory power
of duplex formation.
5.1 A HYPOTHETICAL ANALYSIS OF THE SEQUENCE OF
BACTERIOPHAGE λ DNA

Lambda phage DNA is 48,502 base pairs long. Its
sequence has been completely determined, and we have
treated one strand of this as a test case in a computer
simulation of the analysis. The table shows that the
appropriate size of oligomer to use for a sequence of
this complexity is the 10-mer. With a matrix of
10-mers, the size was 1024 lines square. After
"hybridisation" of the lambda 10-mers in the computer,
46,377 cells were positive; 1957 had double
occurrences, 75 triple occurrences, and three quadruple
occurrences. These 46,377 positive cells represented
known sequences, determined from their position in the
matrix. Each was extended by four x one base at the 3'
end and four x one base at the 5' end to give
[illegible] number of double occurrences to 161, a
further 16-fold extension brought the number down to
[illegible], and one more provided a completely
overlapped result. Of course, the same end result of a
fully overlapped sequence could be achieved starting
with a 4^16 matrix, but the matrix would be 4000 times
bigger than the matrix needed to represent all 10-mers,
and most of the sequence represented on it would be
redundant.
5.2 LAYING DOWN THE MATRIX
The method described here envisages that the
matrix will be produced by synthesising oligonucleotides
in the cells of an array by laying down the precursors
for the four bases in a predetermined pattern, an
example of which is described above. Automatic
equipment for applying the precursors has yet to be
developed, but there are obvious possibilities; it
should not be difficult to adapt a pen plotter or other
computer-controlled printing device to the purpose.
The smaller the pixel size of the array the better, as
complex genomes need very large numbers of cells.
However, there are limits to how small these can be
made. 100 microns would be a fairly comfortable upper
limit, but could probably not be achieved on paper for
reasons of texture and diffusion. On a smooth
impermeable surface, such as glass, it may be possible
to achieve a resolution of around 10 microns, for
example by using a laser typesetter to preform a
solvent repellant grid, and building the
oligonucleotides in the exposed regions. One attractive
possibility, which allows adaptation of present
techniques of oligonucleotide synthesis, is to sinter
microporous glass in microscopic patches onto the
surface of a glass plate. Laying down very large
numbers of lines or dots could take a long time, if the
printing mechanism were slow. However, a low cost
ink-jet printer can print at speeds of about 10,000 spots
per second. With this sort of speed, 10^8 spots could
be printed in about three hours.
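
The printing time follows from simple arithmetic. The Python
sketch below reads the figures in the preceding paragraph as a
rate of 10,000 spots per second and a total of 10^8 spots; both
readings, and the function name, are assumptions made for the
example.

```python
def printing_hours(spots, spots_per_second=10_000):
    """Time in hours to lay down the given number of spots."""
    return spots / spots_per_second / 3600


print(round(printing_hours(10 ** 8), 2))          # hours for 10^8 spots
print(round(printing_hours(4 ** 10) * 3600))      # seconds for all 10-mers
```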

5.3 OLIGONUCLEOTIDE SYNTHESIS

There are several methods of synthesising
oligonucleotides. Most methods in current use attach
the nucleotides to a solid support of controlled pore
size glass (CPG) and are suitable for adaptation to
synthesis on a glass surface. Although we know of no
description of the direct use of oligonucleotides as
hybridisation probes while still attached to the matrix
on which they were synthesised, there are reports of
the use of oligonucleotides as hybridisation probes on
solid supports to which they were attached after
synthesis. PCT Application WO 85/01051 describes a
method for synthesising oligonucleotides tethered to a
CPG column. In an experiment performed by us, CPG was
used as the support in an Applied Biosystems
oligonucleotide synthesiser to synthesise a 13-mer
complementary to the left hand cos site of phage
lambda. The coupling steps were all close to
theoretical yield. The first base was stably attached
to the support medium through all the synthesis and
deprotection steps by a covalent link.

6. PROBES, HYBRIDISATION AND DETECTION

The yield of oligonucleotides synthesised on
microporous glass is about 50 µmol/g. A patch of this
material 1 micron thick by 10 microns square would hold
≈ 3 x 10^-12 µmol, equivalent to about 2 g of human
DNA. The hybridisation reaction could therefore be
carried out with a very large excess of the bound
oligonucleotides over that in the probe. So it should
be possible to design a system capable of
distinguishing between hybridisation involving single
and multiple occurrences of the probe sequence, as
yield will be proportional to concentration at all
stages in the reaction.

The polynucleotide sequence to be analysed may be
of DNA or RNA. To prepare the probe, the polynucleotide
may be degraded to form fragments. Preferably it is
degraded by a method which is as random as possible, to
an average length around the chosen length s of the
oligonucleotides on the support, and oligomers of exact
length s selected by electrophoresis on a sequencing
gel. The probe is then labelled. For example,
oligonucleotides of length s may be end labelled. If
labelled with 32P, the radioactive yield of any
individual s-mer even from total human DNA could be
more than 10 [exponent illegible] dpm/mg of total DNA.
For detection, only a small fraction of this is needed
in a patch 10-100 microns square. This allows
hybridisation conditions to be chosen to be close to
the Tm of duplexes, which decreases the yield of hybrid
and decreases the rate of formation, but increases the
discriminating power. Since the bound oligonucleotide
is in excess, signal need not be a problem even working
close to equilibrium.

Hybridisation conditions can be chosen to be those
known to be suitable in standard procedures used to
hybridise to filters, but establishing optimum
conditions is important. In particular, temperature
needs to be controlled closely, preferably to better
than [illegible]. Particularly when the chosen length of
the oligonucleotide is small, the analysis needs to be
able to distinguish between slight differences of rate
and/or extent of hybridisation. The equipment may need
to be programmed for differences in base composition
between different oligonucleotides. In constructing
the array, it may be preferable to partition this into
sub-matrices with similar base compositions. This may
make it easier to define the Tm, which may differ
slightly according to the base composition.

Autoradiography, especially with 32P, causes image
degradation which may be a limiting factor determining
resolution; the limit for silver halide films is around
25 microns. Obviously some direct detection system
would be better. Fluorescent probes are envisaged;
given the high concentration of the target
oligonucleotides, the low sensitivity of fluorescence
may not be a problem.

We have considerable experience of scanning
autoradiographic images with a digitising scanner. Our
present design is capable of resolution down to 25
microns, which could readily be extended down to less
than [illegible] present application, depending on the
quality of the hybridisation reaction, and how good it
is at distinguishing absence of a sequence from the
presence of one or more. Devices for measuring
astronomical plates have an accuracy around 1 micron.
Scan speeds are such that a matrix of several million
cells can be scanned in a few minutes. Software for the
analysis of the data is straight-forward, though the
large data sets need a fast computer.
Experiments presented below demonstrate the
feasibility of the claims.

Commercially available microscope slides (BDH
Super Premium 76 x 26 x 1 mm) were used as supports.
These were derivatised with a long aliphatic linker
that can withstand the conditions used for the
deprotection of the aromatic heterocyclic bases, i.e.
30% NH3 at 55° for 10 hours. The linker, bearing a
hydroxyl group which serves as a starting point for the
subsequent oligonucleotide, is synthesised in two
steps. The slides are first treated with a 25%
solution of 3-glycidoxypropyltriethoxysilane in xylene
containing several drops of Hunig's base as a catalyst.
The reaction is carried out in a staining jar, fitted
with a drying tube, for 20 hours at 90°C. The slides
are washed with MeOH, Et2O and air dried. Then neat
hexaethylene glycol and a trace amount of conc.
sulphuric acid are added and the mixture kept at 80°
for 20 hours. The slides are washed with MeOH, Et2O,
air dried and stored desiccated at -20° until use.
This preparative technique is described in British
Patent Application 8822228.6 filed 21 September 1988.

The oligonucleotide synthesis cycle is performed
as follows:

The coupling solution is made up fresh for each
step by mixing 6 vol. of 0.5M tetrazole in anhydrous
acetonitrile with 5 vol. of a 0.2M solution of the
required beta-cyanoethylphosphoramidite. Coupling time
is three minutes. Oxidation with a 0.1M solution
of I2 in THF/pyridine/H2O yields a stable
phosphotriester bond. Detritylation of the 5' end with 3%
trichloroacetic acid in dichloromethane allows further
extension of the oligonucleotide chain. There was no
capping step since the excess of phosphoramidites used
over reactive sites on the slide was large enough to
drive the coupling to completion. After the synthesis
is completed, the oligonucleotide is deprotected in 30%
NH3 for 10 hours at 55°. The chemicals used in the
coupling step are moisture-sensitive, and this critical
step must be performed under anhydrous conditions in a
sealed container, as follows. The shape of the patch
to be synthesised was cut out of a sheet of silicone
rubber (75 x 26 x 0.5 mm) which was sandwiched between
a microscope slide, derivatised as described above, and
a piece of teflon of the same size and thickness. To
this was fitted a short piece of plastic tubing that
allowed us to inject and withdraw the coupling solution
by syringe and to flush the cavity with Argon. The
whole assembly was held together by fold-back paper
clips. After coupling the set-up was disassembled and
the slide put through the subsequent chemical reactions
(oxidation with iodine, and detritylation by treatment
with TCA) by dipping it into staining jars.
EXAMPLE 1 . 

As a first example we synthesised the sequences
oligo-dT10 - oligo-dT14 on a slide by gradually decreasing
the level of the coupling solution in steps 10 to 14.
Thus the 10-mer was synthesised on the upper part of
the slide, the 14-mer at the bottom and the 11, 12 and
13-mers were in between. We used 10 pmol oligo-dA,
labelled at the 5' end with 32P by the polynucleotide
kinase reaction to a total activity of 1.5 million
c.p.m., as a hybridisation probe. Hybridisation was
carried out in a perspex (Plexiglas) container made to
fit a microscope slide, filled with 1.2 ml of 1M NaCl
in TE, 0.1% SDS, for 5 minutes at 20°. After a short
rinse in the same solution without oligonucleotide, we
were able to detect more than 2000 c.p.s. with a
radiation monitor. An autoradiograph showed that all
the counts came from the area where the oligonucleotide
had been synthesised, i.e. there was no non-specific
binding to the glass or to the region that had been
derivatised with the linker only. After partial
elution in 0.1 M NaCl differential binding to the
target is detectable, i.e. less binding to the shorter
[illegible] slide in the wash solution we determined
the Tm (mid-point of transition when 50% eluted) to be
33°. There were no counts detectable after incubation
at 39°. The hybridisation and melting was repeated
eight times with no diminution of the signal. The
result is reproducible. We estimate that at least 5%
of the input counts were taken up by the slide at each
cycle.
EXAMPLE 2. 

In order to determine whether we would be able to
distinguish between matched and mismatched
oligonucleotides we synthesised two sequences 3' CCC GCC GCT
GGA (cosL) and 3' CCC GCC TCT GGA, which differ by one
base at position 7. All bases except the seventh were
added in a rectangular patch. At the seventh base,
half of the rectangle was exposed in turn to add the
two different bases, in two stripes. Hybridisation of
cosR probe oligonucleotide (5' GGG CGG CGA CCT) (kinase
labelled with 32P to 1.1 million c.p.m., 0.1 M NaCl,
TE, 0.1% SDS) was for 5 hours at 32°. The front of the
slide showed 100 c.p.s. after rinsing. Autoradiography
showed that annealing occurred only to the part of the
slide with the fully complementary oligonucleotide.
No signal was detectable on the patch with the
mismatched sequence.
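
The pairing of the probe with the two bound sequences can be
checked position by position. The Python sketch below reads the
probe as 5' GGG CGG CGA CCT and the bound sequences as written
3' to 5'; the function name is hypothetical, and pairing is
idealised as strict Watson-Crick complementarity.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatch_positions(bound_3to5, probe_5to3):
    """1-based positions where the bound oligonucleotide (read 3' to 5')
    does not pair with the probe (read 5' to 3')."""
    return [i + 1
            for i, (b, p) in enumerate(zip(bound_3to5, probe_5to3))
            if COMPLEMENT[b] != p]


probe = "GGGCGGCGACCT"        # 5' -> 3'
matched = "CCCGCCGCTGGA"      # 3' -> 5'
mismatched = "CCCGCCTCTGGA"   # 3' -> 5'

print(mismatch_positions(matched, probe))
print(mismatch_positions(mismatched, probe))
```

The first sequence pairs at every position and the second fails
only at position 7, near the centre of the 12-base duplex.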

EXAMPLE 3.

For a further study of the effects of mismatches
or shorter sequences on hybridisation behaviour, we
constructed two arrays; one (a) of 24 oligonucleotides
and the other (b) of 72 oligonucleotides.

These arrays were set out as shown in Table 1(a)
and 1(b). The masks used to lay down these arrays were
different from those used in previous experiments.
Lengths of silicone rubber tubing (1mm o.d.) were glued
with silicone rubber cement to the surface of plain
microscope slides, in the form of a "U". Clamping
these masks against a derivatised microscope slide
produced a cavity into which the coupling solution was
introduced through a syringe. In this way only the
part of the slide within the cavity came into contact
with the phosphoramidite solution. Except in the
positions of the mismatched bases, the arrays listed in
Table 1 were laid down using a mask which covered most
of the width of the slide. Off-setting this mask by
3mm up or down the derivatised slide in subsequent
coupling reactions produced the oligonucleotides
truncated at the 3' or 5' ends.

For the introduction of mismatches a mask was used
which covered half (for array (a)) or one third (for
array (b)) of the width of the first mask. The bases
at positions six and seven were laid down in two or
three longitudinal stripes. This led to the synthesis
of oligonucleotides differing by one base on each half
(array (a)) or third (array (b)) of the slide. In
other positions, the sequences differed from the
longest sequence by the absence of bases at the ends.
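
The family of end-truncated sequences produced by off-setting the mask can be enumerated mechanically. The sketch below is illustrative only: the helper name and the minimum length of 13 are assumptions, and the longest sequence is taken to be the complement of the 19-mer probe, written 3' to 5'.

```python
def end_truncations(full, min_len):
    """List versions of `full` that lack bases at one end only.

    Returns (leading, trailing): `leading` keeps the start of the sequence
    and drops bases from the far end; `trailing` keeps the end and drops
    bases from the start. Both are ordered from shortest to longest.
    """
    seq = full.replace(" ", "")
    leading = [seq[:n] for n in range(min_len, len(seq))]
    trailing = [seq[-n:] for n in range(min_len, len(seq))]
    return leading, trailing

# Complement of the 19-mer probe, written 3'->5' (assumed longest sequence).
FULL = "GAG GAC TCC TCT TCA GAC G"
leading, trailing = end_truncations(FULL, min_len=13)  # min_len is hypothetical

print(len(leading), len(trailing))  # 6 6
print(leading[-1])                  # GAGGACTCCTCTTCAGAC
print(trailing[-1])                 # AGGACTCCTCTTCAGACG
```

The last entry of each list is an 18-mer differing from the full 19-mer by a single base at one end.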

In array (b), there were two columns of sequences
between those shown in Table 1(b), in which the sixth
and seventh bases were missing in all positions,
because the slide was masked in a stripe by the
silicone rubber seal. Thus there were a total of 72
different sequences represented on the slide in 90
different positions.

The 19-mer 5' CTC CTG AGG AGA AGT CTG C was used
for hybridisation (2 million cpm, 1.2 ml 0.1 M NaCl in
TE, 0.1% SDS, 20°).

The washing and elution steps were followed by
autoradiography. The slide was kept in the washing
solution for 5 min at each elution step and then
exposed (45 min, intensified). Elution temperatures
were 23, 36, 42, 47, 55 and 60°C respectively.

As indicated in the table, the oligonucleotides
showed different melting behaviour. Short
oligonucleotides melted before longer ones, and at 55°C
only the perfectly matched 19-mer was stable; all other
oligonucleotides had been eluted. Thus the method can
differentiate between an 18-mer and a 19-mer which
differ only by the absence of one base at the end.
Mismatches at the end of the oligonucleotides and at
internal sites can all be melted under conditions where
the perfect duplex remains.

Thus we are able to use very stringent
hybridisation conditions that eliminate annealing to
mismatch sequences or to oligonucleotides differing in
length by as little as one base. No other method using
hybridisation of oligonucleotides bound to solid
supports is so sensitive to the effects of mismatching.
EXAMPLE 4. 

To test the application of the invention to
diagnosis of inherited diseases, we hybridised the
array (a), which carries the oligonucleotide sequences
specific for the wild type and the sickle cell mutations
of the β-globin gene, with a 110 base pair fragment of
DNA amplified from the β-globin gene by means of the
polymerase chain reaction (PCR). Total DNA from the
blood of a normal individual (1 microgram) was
amplified by PCR in the presence of appropriate primer
oligonucleotides. The resulting 110 base pair fragment
was purified by electrophoresis through an agarose gel.
After elution, a small sample (ca. 10 picogram) was
labelled by using α-32P-dCTP (50 microCurie) in a
second PCR reaction. This PCR contained only the
upstream priming oligonucleotide. After 60 cycles of
amplification with an extension time of 9 min. the
product was removed from precursors by gel filtration.
Gel electrophoresis of the radioactive product showed a
major band corresponding in length to the 110 base
fragment. One quarter of this product (100,000 c.p.m.
in 0.9 M NaCl, TE, 0.1% SDS) was hybridised to the
array (a). After 2 hours at 30° ca. 15000 c.p.m. had
been taken up. The melting behaviour of the hybrids
was followed as described for the 19-mer in Example 3,
and it was found that the melting behaviour was similar
to that of the oligonucleotide. That is to say, the
mismatches considerably reduced the melting temperature
of the hybrids, and conditions were readily found such
that the perfectly matched duplex remained whereas the
mismatched duplexes had fully melted.

Thus the invention can be used to analyse long
fragments of DNA as well as oligonucleotides, and this
example shows how it may be used to test nucleic acid
sequences for mutations. In particular it shows how it
may be applied to the diagnosis of genetic diseases.

EXAMPLE 5. 

To test an automated system for laying down the
precursors, the cosL oligonucleotide was synthesised
with 11 of the 12 bases added in the way described
above. For the addition of the seventh base, however,
the slide was transferred into an Argon filled chamber
containing a pen plotter. The pen of the plotter had
been replaced by a component, fabricated from Nylon,
which had the same shape and dimensions as the pen, but
which carried a polytetrafluoroethylene (PTFE) tube,
through which chemicals could be delivered to the
surface of the glass slide which lay on the bed of the
plotter. A microcomputer was used to control the
plotter and the syringe pump which delivered the
chemicals. The pen, carrying the delivery tube from
the syringe, was moved into position above the slide,
the pen was lowered and the pump activated to lay down
G, T and A phosphoramidite solutions an array of twelve
spots was laid down in three groups of four, with three
different oligonucleotide sequences. After
hybridisation to cosR, as described in Example 2, and
autoradiography, signal was seen only over the four
spots of perfectly matched oligonucleotides, where the
dG had been added.

In conclusion, we have demonstrated the following: 

1. It is possible to synthesise oligonucleotides in
good yield on a flat glass plate.

2. Multiple sequences can be synthesised on the sample
in small spots, at high density, by a simple manual
procedure, or automatically using a computer controlled
device.

3. Hybridisation to the oligonucleotides on the plate
can be carried out by a very simple procedure.
Hybridisation is efficient, and hybrids can be detected
by a short autoradiographic exposure.

4. Hybridisation is specific. There is no detectable
signal on areas of the plate where there are no
oligonucleotides. We have tested the effects of
mismatched bases, and found that a single mismatched base
at any position in oligonucleotides ranging in length
from 12-mer to 19-mer reduces the stability of the
hybrid sufficiently that the signal can be reduced to a
very low level, while retaining significant
hybridisation to the perfectly matched hybrid.

5. The oligonucleotides are stably bound to the glass
and plates can be used for hybridisation repeatedly.

The invention thus provides a novel way of
analysing nucleotide sequences, which should find a
wide range of application. We list a number of
potential applications below:

Small arrays of oligonucleotides as fingerprinting
and mapping tools.

Analysis of known mutations including genetic diseases.
Example 4 above shows how the invention may be
used to analyse mutations. There are many
applications for such a method, including the detection
of inherited diseases.

Genomic fingerprinting.

In the same way as mutations which lead to disease
can be detected, the method could be used to detect
point mutations in any stretch of DNA. Sequences are
now available for a number of regions containing the
base differences which lead to restriction fragment
length polymorphisms (RFLPs). An array of
oligonucleotides representing such polymorphisms could be
made from pairs of oligonucleotides representing the
two allelic restriction sites. Amplification of the
sequence containing the RFLP, followed by hybridisation
to the plate, would show which alleles were present in
the test genome. The number of oligonucleotides that
could be analysed in a single analysis could be quite
large. Fifty pairs made from selected alleles would
be enough to give a fingerprint unique to an individual.
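
As an illustration of how a plate of allelic pairs might be read, each pair of spots in a diploid genome can show signal on the first spot, the second spot, or both. The sketch below is hypothetical throughout: the function names, the threshold and the signal values are invented for the example, and the count of possible fingerprints assumes the fifty pairs vary independently.

```python
def call_genotype(signal_a, signal_b, threshold):
    """Classify one allelic pair of spots from two hybridisation signals.

    `threshold` is an assumed cut-off separating signal from background.
    """
    has_a = signal_a >= threshold
    has_b = signal_b >= threshold
    if has_a and has_b:
        return "A/B"
    if has_a:
        return "A/A"
    if has_b:
        return "B/B"
    return "no call"

def fingerprint(pairs, threshold):
    """Genotype every allelic pair; `pairs` is a list of (signal_a, signal_b)."""
    return tuple(call_genotype(a, b, threshold) for a, b in pairs)

# Made-up signals for three pairs, for illustration only.
print(fingerprint([(900, 20), (450, 480), (15, 870)], threshold=100))
# ('A/A', 'A/B', 'B/B')

# Upper bound on distinct fingerprints from fifty two-allele pairs.
print(f"{3 ** 50:.2e}")  # 7.18e+23
```

With three possible outcomes per pair, fifty pairs allow far more combinations than there are individuals, which is the sense in which such a fingerprint could be unique.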
Linkage analysis.
Applying the method described in the last
paragraph to a pedigree would pinpoint recombinations.
Each pair of spots in the array would give the
information that is seen in the track of the RFLP
analysis, using gel electrophoresis and blotting, that
is now routinely used for linkage studies. It should
be possible to analyse many alleles in a single
analysis, by hybridisation to an array of allelic pairs
of oligonucleotides, greatly simplifying the methods
used to find linkage between a DNA polymorphism and
phenotypic marker such as a disease gene.
method we have developed and confirmed by experiments.

Large arrays of oligonucleotides as sequence
reading tools.

We have shown that oligonucleotides can be
synthesised in small patches in precisely determined
positions by one of two methods: by delivering the
precursors through the pen of a pen-plotter, or by
masking areas with silicone rubber. It is obvious how
a pen plotter could be adapted to synthesise large
arrays with a different sequence in each position. For
some applications the array should be a predetermined,
limited set; for other applications, the array should
comprise every sequence of a predetermined length. The
masking method can be used for the latter by laying
down the precursors in a mask which produces
intersecting lines. There are many ways in which this
can be done and we give one example for illustration:
1. The first four bases, A, C, G, T, are laid in four
broad stripes on a square plate.
2. The second set is laid down in four stripes equal
in width to the first, and orthogonal to them. The
array is now composed of all sixteen dinucleotides.
3. The third and fourth layers are laid down in four
sets of four stripes one quarter the width of the first
stripes. Each set of four narrow stripes runs within
one of the broader stripes. The array is now composed
of all 256 tetranucleotides.
4. The process is repeated, each time laying down two
layers with stripes which are one quarter the width of
the previous two layers. Each layer added increases
the length of the oligonucleotides by one base, and the
number of different oligonucleotide sequences by a
factor of four.
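
The stripe scheme in steps 1 to 4 can be sketched as a mapping from a cell's row and column to the sequence built there. This is one possible reading, not the only layout the text allows: it assumes the bases are laid in the order A, C, G, T within every set of stripes and that odd-numbered layers run along the columns and even-numbered layers along the rows. All names are hypothetical.

```python
BASES = "ACGT"

def base4_digits(value, n_digits):
    """Digits of `value` in base 4, most significant first."""
    return [(value // 4 ** (n_digits - 1 - i)) % 4 for i in range(n_digits)]

def sequence_at(row, col, n_rounds):
    """Sequence built at cell (row, col) of a grid with 4**n_rounds cells per side.

    Each round lays one set of stripes along the columns and then one
    orthogonal set along the rows, a quarter of the width of the round
    before. The stripe a cell falls in at round k is therefore the k-th
    base-4 digit of its column (or row) index.
    """
    col_digits = base4_digits(col, n_rounds)
    row_digits = base4_digits(row, n_rounds)
    seq = []
    for c, r in zip(col_digits, row_digits):
        seq.append(BASES[c])  # odd-numbered layer: stripes along columns
        seq.append(BASES[r])  # even-numbered layer: orthogonal stripes
    return "".join(seq)

def build_array(n_rounds):
    """Return the full grid of sequences as a list of rows."""
    side = 4 ** n_rounds
    return [[sequence_at(r, c, n_rounds) for c in range(side)] for r in range(side)]

grid = build_array(2)                     # two rounds: four layers
cells = {s for row in grid for s in row}
print(len(grid), len(cells))              # 16 256
```

Two rounds give a 16 by 16 grid in which every one of the 256 tetranucleotides occurs exactly once; each further round multiplies the side by four and the number of sequences by sixteen.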

The dimensions of such arrays are determined by
the width of the stripes. The narrowest stripe we
have laid is 1mm, but this is clearly not the lowest
limit.

There are useful applications for arrays in which
part of the sequence is predetermined and part made up
of all possible sequences. For example:
Characterising mRNA populations.
Most mRNAs in higher eukaryotes have the sequence
AAUAAA close to the 3' end. The array used to analyse
mRNAs would have this sequence all over the plate. To
analyse a mRNA population it would be hybridised to an
array composed of all sequences of the type N(m)AATAAAN(n).
For m + n = 8, which should be enough to give a unique
oligonucleotide address to most of the several thousand
mRNAs that is estimated to be present in a source such
as a mammalian cell, the array would be 256 elements
square. The 256 x 256 elements would be laid on the
AATAAA using the masking method described above. With
stripes of around 1mm, the array would be ca. 256mm
square.
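
The dimensions quoted follow from simple counting: eight variable positions give 4^8 sequences, and splitting them between two orthogonal stripe directions gives 4^4 elements per side. A short sketch of the arithmetic, with a hypothetical function name and the 1mm stripe width taken from the text:

```python
def array_side_elements(variable_bases):
    """Elements along one side of a square array holding every sequence
    of `variable_bases` positions, split evenly between the two
    orthogonal stripe directions."""
    if variable_bases % 2:
        raise ValueError("an even number of variable bases gives a square array")
    return 4 ** (variable_bases // 2)

stripe_mm = 1.0                       # narrowest stripe width quoted in the text
side = array_side_elements(8)         # m + n = 8
print(side, side * side)              # 256 65536
print(side * stripe_mm)               # 256.0
```

Halving the stripe width would halve the side of the plate without changing the number of elements.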

This analysis would measure the complexity of the
mRNA population and could be used as a basis for
comparing populations from different cell types. The
advantage of this approach is that the differences in
the hybridisation pattern would provide the sequence of
oligonucleotides that could be used as probes to
isolate all the mRNAs that differed in the populations.
Sequence determination.

To extend the idea to determine unknown sequences,
using an array composed of all possible oligonucleotides
of a chosen length, requires larger arrays than we have
synthesised to date. However, it is possible to scale
down the size of spot and scale up the numbers to those
required by extending the methods we have developed and
tested on small arrays. Our experience shows that the
method is much simpler in operation than the gel based



TABLE 1

[Table 1(a), array (a) of Example 3: the entries are not legible in this copy. The surviving fragments are truncated forms of 3' GAG GAC TCC TCT TCA GAC G and of its variant GAG GAC aCC TCT TCA GAC G, each with a melting temperature.]

For Example 3 array (b) was set out as follows:











20  GAG GAt TC                    20  GAG GAC TC                    20  GAG GAC aC
20  GAG GAt TCC                   20  GAG GAC TCC                   20  GAG GAC aCC
20  GAG GAt TCC T                 20  GAG GAC TCC T                 20  GAG GAC aCC T
20  GAG GAt TCC TC                20  GAG GAC TCC TC                20  GAG GAC aCC TC
20  GAG GAt TCC TCT               20  GAG GAC TCC TCT               20  GAG GAC aCC TCT
20  GAG GAt TCC TCT T             20  GAG GAC TCC TCT T             20  GAG GAC aCC TCT T
20  GAG GAt TCC TCT TC            20  GAG GAC TCC TCT TC            20  GAG GAC aCC TCT TC
20  GAG GAt TCC TCT TCA           20  GAG GAC TCC TCT TCA           20  GAG GAC aCC TCT TCA
22  GAG GAt TCC TCT TCA G         42  GAG GAC TCC TCT TCA G         20  GAG GAC aCC TCT TCA G
32  GAG GAt TCC TCT TCA GA        47  GAG GAC TCC TCT TCA GA        32  GAG GAC aCC TCT TCA GA
42  GAG GAt TCC TCT TCA GAC       52  GAG GAC TCC TCT TCA GAC       42  GAG GAC aCC TCT TCA GAC
52  GAG GAt TCC TCT TCA GAC G     60  GAG GAC TCC TCT TCA GAC G     52  GAG GAC aCC TCT TCA GAC G
42  .AG GAt TCC TCT TCA GAC G     52  .AG GAC TCC TCT TCA GAC G     42  .AG GAC aCC TCT TCA GAC G
?   ..G GAt TCC TCT TCA GAC G     52  ..G GAC TCC TCT TCA GAC G     42  ..G GAC aCC TCT TCA GAC G
37  ... GAt TCC TCT TCA GAC G     47  ... GAC TCC TCT TCA GAC G     37  ... GAC aCC TCT TCA GAC G
22  ... .At TCC TCT TCA GAC G     42  ... .AC TCC TCT TCA GAC G     32  ... .AC aCC TCT TCA GAC G
?   ... ..t TCC TCT TCA GAC G     42  ... ..C TCC TCT TCA GAC G     32  ... ..C aCC TCT TCA GAC G
22  ... ... TCC TCT TCA GAC G     32  ... ... TCC TCT TCA GAC G     32  ... ... aCC TCT TCA GAC G

Between the three columns of array (b) listed above, were two
columns, in which bases 6 and 7 from the left were missing in
every line. These sequences all melted at 20 or 32 degrees.

1. Apparatus for analysing a polynucleotide sequence,
comprising a support and attached to a surface thereof
an array of the whole or a chosen part of a complete
set of oligonucleotides of chosen lengths, the
oligonucleotides being capable of taking part in
hybridisation reactions.

2. Apparatus for studying differences between
polynucleotide sequences, comprising a support and attached
to a surface thereof an array of the whole or a chosen
part of a complete set of oligonucleotides of chosen
lengths comprising the polynucleotide sequences, the
oligonucleotides being capable of taking part in
hybridisation reactions.

3. Apparatus as claimed in claim 2, wherein the array
comprises one or more pairs of oligonucleotides of
chosen lengths.

4. Apparatus as claimed in claim 3, wherein the array
comprises one or more pairs of oligonucleotides of
chosen lengths representing normal and mutant versions
of a point mutation to be studied.

5. Apparatus as claimed in any one of claims 1 to 4,
wherein the chosen length is from 8 to 21 nucleotides.

6. Apparatus as claimed in any one of claims 1 to 5,
wherein the surface of the support to which the
oligonucleotides are attached is of glass.

7. Apparatus as claimed in any one of claims 1 to 6,
wherein each oligonucleotide is bound to the support
through a covalent link.

8. A method of analysing a polynucleotide sequence,
by the use of a support to the surface of which is
attached an array of the whole or a chosen part of a
complete set of oligonucleotides of chosen lengths,
which method comprises labelling the polynucleotide
sequence or fragments thereof to form labelled
material, applying the labelled material under
hybridisation conditions to the array, and observing the
location of the label on the surface associated with
particular members of the set of oligonucleotides.

9. A method according to claim 8, applied to the
study of differences between polynucleotide sequences,
wherein the array is of the whole or a chosen part of
the complete set of oligonucleotides of chosen lengths
comprising the polynucleotide sequences.

10. A method as claimed in claim 9, wherein the array
comprises one or more pairs of oligonucleotides of
chosen lengths.



11. A method as claimed in claim 10, wherein the array
comprises one or more pairs of oligonucleotides of
chosen lengths representing normal and mutant versions
of a point mutation being studied.

12. A method according to any one of claims 8 to 11,
wherein the polynucleotide sequence is randomly
degraded to form a mixture of oligomers of a chosen
length, the mixture being thereafter labelled to form
the labelled material.

13. A method as claimed in claim 12, wherein the
oligomers are labelled with 32P.

14. A method as claimed in any one of claims 8 to 13,
wherein the chosen length is from 8 to 20 nucleotides.